A Bi-Criteria Approximation Algorithm for k-Means

Authors

  • Konstantin Makarychev
  • Yury Makarychev
  • Maxim Sviridenko
  • Justin Ward
Abstract

We consider the classical k-means clustering problem in the setting of bi-criteria approximation, in which an algorithm is allowed to output βk > k clusters, and must produce a clustering with cost at most α times the cost of the optimal set of k clusters. We argue that this approach is natural in many settings, for which the exact number of clusters is a priori unknown, or unimportant up to a constant factor. We give new bi-criteria approximation algorithms, based on linear programming and local search, respectively, which attain a guarantee α(β) depending on the number βk of clusters that may be opened. Our guarantee α(β) is always at most 9 + ε and improves rapidly with β (for example: α(2) < 2.59, and α(3) < 1.4). Moreover, our algorithms have only polynomial dependence on the dimension of the input data, and so are applicable in high-dimensional settings.
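
The bi-criteria guarantee described in the abstract can be made concrete on a toy instance. The following is only an illustrative sketch, not the LP-based or local-search algorithm of the paper: the function names (`kmeans_cost`, `optimal_cost`, `bicriteria_ratio`) are hypothetical, and the optimum is found by brute force over all assignments, which is feasible only for a handful of points. It measures the ratio between the cost of a solution with βk centers and the optimal cost with k centers, which is the quantity that α(β) bounds from above.

```python
from itertools import product


def kmeans_cost(points, centers):
    """Sum over points of the squared Euclidean distance to the nearest center."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers)
        for point in points
    )


def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def optimal_cost(points, k):
    """Brute-force optimal k-means cost (assumption: tiny inputs only, k**n labelings)."""
    best = float("inf")
    for labels in product(range(k), repeat=len(points)):
        clusters = [[p for p, l in zip(points, labels) if l == j] for j in range(k)]
        centers = [centroid(c) for c in clusters if c]  # skip empty clusters
        best = min(best, kmeans_cost(points, centers))
    return best


def bicriteria_ratio(points, centers, k):
    """Cost of `centers` (up to beta*k of them) divided by the optimal k-center cost.

    Assumes the optimal cost is nonzero.
    """
    return kmeans_cost(points, centers) / optimal_cost(points, k)


# Toy instance on a line: k = 1, and a solution that opens beta*k = 2 centers.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
two_centers = [(0.5,), (10.5,)]
ratio = bicriteria_ratio(points, two_centers, k=1)
```

On this instance the two-center solution is far cheaper than the best single center, so the measured ratio is well below 1. The guarantee α(β) is a worst-case upper bound on this ratio; individual instances may do much better.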

Similar resources

A hybrid DEA-based K-means and invasive weed optimization for facility location problem

In this paper, instead of the classical approach to the multi-criteria location selection problem, a new approach was presented based on selecting a portfolio of locations. First, the indices affecting the selection of maintenance stations were collected. The K-means model was used for clustering the maintenance stations. The optimal number of clusters was calculated through the Silhou...

A Constant-Factor Bi-Criteria Approximation Guarantee for k-means++

This paper studies the k-means++ algorithm for clustering as well as the class of D^ℓ sampling algorithms to which k-means++ belongs. It is shown that for any constant factor β > 1, selecting βk cluster centers by D^ℓ sampling yields a constant-factor approximation to the optimal clustering with k centers, in expectation and without conditions on the dataset. This result extends the previously known...

Exact algorithms for solving a bi-level location–allocation problem considering customer preferences

The issue discussed in this paper is a bi-level problem in which two rivals compete in attracting customers and maximizing their profits which means that competitors competing for market share must compete in the centers that are going to be located in the near future. In this paper, a nonlinear model presented in the literature considering customer preferences is linearized. Customer behavior ...

Greedy bi-criteria approximations for k-medians and k-means

This paper investigates the following natural greedy procedure for clustering in the bi-criterion setting: iteratively grow a set of centers, in each round adding the center from a candidate set that maximally decreases clustering cost. In the case of k-medians and k-means, the key results are as follows. • When the method considers all data points as candidate centers, then selecting O(k log(1...

Design and training of artificial neural networks using an evolution strategy with parallel populations

Application of artificial neural networks (ANN) in areas such as classification of images and audio signals shows the ability of this artificial intelligence technique for solving practical problems. Construction and training of ANNs is usually a time-consuming and hard process. A suitable neural model must be able to learn the training data and also have the generalization ability. In this pap...

Publication date: 2016